Mendel's Genetics #16
base: 2026-01-27-hamming-distance
Once the build has completed, you can preview your PR at this URL: https://biojulia.dev/BiojuliaDocs/previews/PR16/
Just noting that the comment is being made, but the link doesn't actually work. Probably unrelated to the above, your pull request is for some reason requesting to merge into another branch, rather than into
kescobo
left a comment
Another solution would be to use StatsBase.jl and do a weighted probability.
One other thing that would be nice to include here is a bit more didactic discussion about how we often make algorithms that are narrowly tailored, but then either repeat ourselves or get more complicated as additional requirements get tacked on. E.g., for this problem, your solution works for the specific problem, but we'd have to derive a new equation if the question were something like "What's the probability of a heterozygous offspring?" It also doesn't scale up if we add another trait, etc.
The nice thing about the StatsBase.jl solution, and even a simulation, is that they can be made generic and then used to ask more types of questions. I'm not necessarily demanding we add this to a first draft, but maybe open an issue as a potential enhancement.
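To make the point about generic solutions concrete, here is a minimal sketch. It is not the PR's code and it does not use StatsBase.jl. It swaps in plain enumeration over ordered parent pairs using only Base Julia. The names `offspring_probability`, `has_dominant`, and `is_heterozygous` are hypothetical. The reading of `k`, `m`, `n` as counts of homozygous dominant, heterozygous, and homozygous recessive organisms is an assumption based on the Rosalind convention.

```julia
# Assumption: k, m, n count HH, Hh, and hh organisms respectively.
# `pred` is any function of the offspring's two alleles, given as a Tuple{Char,Char}.
function offspring_probability(pred, k::Integer, m::Integer, n::Integer)
    genotypes = (('H', 'H'), ('H', 'h'), ('h', 'h'))
    counts = (k, m, n)
    total = k + m + n
    total >= 2 || throw(ArgumentError("need at least two organisms to form a pair"))
    p = 0 // 1  # exact rational arithmetic
    for (i, g1) in enumerate(genotypes), (j, g2) in enumerate(genotypes)
        # Probability of drawing genotype i, then genotype j, without replacement.
        pair = (counts[i] // total) * ((counts[j] - (i == j)) // (total - 1))
        # Each parent passes on either of its two alleles with probability 1/2,
        # so each of the four allele combinations has weight 1/4.
        for a in g1, b in g2
            if pred((a, b))
                p += pair // 4
            end
        end
    end
    return p
end

# The question is now a parameter rather than a new formula.
has_dominant(g) = 'H' in g
is_heterozygous(g) = g[1] != g[2]

offspring_probability(has_dominant, 2, 2, 2)      # 47//60
offspring_probability(is_heterozygous, 2, 2, 2)   # 17//30
```

Because the predicate is passed in, asking about heterozygous offspring needs no new derivation. Extending this to a second trait would mean changing how genotypes are represented, which this sketch does not attempt.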
!!! warning "The Problem"

Probability is the mathematical study of randomly occurring phenomena. We will model such a phenomenon with a random variable, which is simply a variable that can take a number of different distinct outcomes depending on the result of an underlying random process.
Suggested change:
Probability is the mathematical study of randomly occurring phenomena.
We will model such a phenomenon with a random variable,
which is simply a variable that can take a number of different distinct outcomes
depending on the result of an underlying random process.
Semantic line breaks please - it makes editing and diffs much nicer.
One way to help remember is to turn off automatic line breaks in your text editor.
### Deriving an Algorithm

Using the information above, we can derive an algorithm using the variables k, m, and n that will calculate the probability of a progeny possessing a dominant allele. We could either calculate the probability of a progeny having a dominant allele, but in this case, it is easier to calculate the likelihood of a progeny having a recessive allele. This is a relatively rarer event, and the calculation will be straightforward. We just have to subtract this probability from 1 to get the overall likelihood of having a progeny with a dominant trait.
Suggested change:
Using the information above, we can derive an algorithm using the variables k, m, and n that will calculate the probability of a progeny possessing a dominant allele. We could either calculate the probability of a progeny having a dominant allele, but in this case, it is easier to calculate the likelihood of a progeny having only recessive alleles. This is a relatively rarer event, and the calculation will be straightforward. We just have to subtract this probability from 1 to get the overall likelihood of having a progeny with a dominant trait.
Or "having the recessive phenotype".
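For reference, the complement calculation discussed in this thread can be written out as below. The definitions of k, m, and n are not shown in this excerpt, so this assumes the Rosalind convention: k homozygous dominant, m heterozygous, and n homozygous recessive organisms, with two parents drawn without replacement and N = k + m + n. Only three pairings can produce the recessive phenotype: hh with hh (always), Hh with hh in either order (half the time), and Hh with Hh (a quarter of the time).

```math
\begin{aligned}
P(\text{recessive}) &= \frac{n(n-1)}{N(N-1)} \cdot 1 + \frac{2mn}{N(N-1)} \cdot \frac{1}{2} + \frac{m(m-1)}{N(N-1)} \cdot \frac{1}{4} \\
&= \frac{n(n-1) + mn + \tfrac{1}{4}\,m(m-1)}{N(N-1)}, \\
P(\text{dominant}) &= 1 - P(\text{recessive}).
\end{aligned}
```

As a check, with k = m = n = 2 this gives P(recessive) = (2 + 4 + 0.5) / 30 = 13/60, so P(dominant) = 47/60 ≈ 0.7833.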
I like the idea of a simulation, though it will generally not give a precisely correct answer for Rosalind. I think that's fine if that's explained.

Making a draft PR here. There are multiple ways to solve the problem, and I added a first approach. I'm thinking that the second would be a more statistical/simulation approach. Basically, based on the values of k, m, and n, we can make a vector containing all of the possible organisms (e.g. [HH, Hh, hh, HH, etc.]). Then, we can calculate the proportion of dominant individuals out of the total.
Wanted to run this by you first and see if you had any suggestions on packages to use.
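A rough sketch of how the simulation idea described above might look, using only the `Random` standard library. The function name `simulate_dominant`, its keyword arguments, and the reading of `k`, `m`, `n` as counts of HH, Hh, and hh organisms are assumptions for illustration, not part of the PR. It builds the population vector, repeatedly draws two distinct parents, lets each pass on one random allele, and reports the fraction of offspring that carry at least one dominant allele.

```julia
using Random

# Assumption: k, m, n count HH, Hh, and hh organisms respectively.
function simulate_dominant(k::Integer, m::Integer, n::Integer;
                           trials::Integer = 100_000,
                           rng::AbstractRNG = Random.default_rng())
    # One entry per organism, each stored as a pair of alleles.
    population = vcat(fill(('H', 'H'), k), fill(('H', 'h'), m), fill(('h', 'h'), n))
    N = length(population)
    N >= 2 || throw(ArgumentError("need at least two organisms to form a pair"))
    dominant = 0
    for _ in 1:trials
        # Draw two distinct parent indices: pick i, then pick j from the rest.
        i = rand(rng, 1:N)
        j = rand(rng, 1:(N - 1))
        j += (j >= i)
        # Each parent passes on one of its two alleles at random.
        a = population[i][rand(rng, 1:2)]
        b = population[j][rand(rng, 1:2)]
        dominant += (a == 'H' || b == 'H')
    end
    return dominant / trials
end

# The estimate varies from run to run and only approximates the exact value.
println(simulate_dominant(2, 2, 2))
```

Because the result is an estimate, it will generally land near the exact answer without matching it to the number of decimal places Rosalind expects. Passing a seeded `rng` makes a run reproducible.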